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ABSTRACT 

An  examination  of  software  system  safety  analysis  has 
been  made  and  generalized  techniques  examined.   These  tech- 
niques parallel  the  techniques  used  for  hardware  analysis 
and  are,  in  fact,  predicted  on  the  fact  that  the  only  safety 
perturbation  in  software  is  one  that  directs  or  misdirects  a 
hardware  component.   Discussion  is  presented  for  a  top  to 
bottom  and  a  bottom  up  hierarchical  analysis,  as  well  as  an 
integrated  technique. 


I.       BACKGROUND 

Although,   the   safety   of  the  product  has    always    received 
some    consideration,    albeit  possibly   tacit,    the   formal, 
systematic  safety  programs   as  we  know  them  today  did  not   come 
into  being   until   the  early    19  60's.      The   one  exception   to    this 
was    the   very   strict   safety   controls   established  by   the  Atomic 
Energy   Commission   on   the  use   and  exposure   to  nuclear  materials. 

The    first   system  safety   documentation   requirement  was    the 
United  States   Air  Force  Ballistic  Systems   Division  Exhibit 
62-41,    "System  Safety  Engineering   for  the  Development  of  Air 
Force   Ballistic  Missiles",    published  in  April   1962.      This 
document  established  System  Safety  requirements   for  the  Asso- 
ciate  Contractors   on   the  Minuteman  missile  program. 

In   September   19  63,    the   United  States   Air  Force   Specifica- 
tion,   MIL-S-38130    (USAF) ,    "General   Requirements    for  Safety 
Engineering   of   Systems    and  Associated  Subsystems   and  Equipment" 
was   promulgated  as   the  first  military   requirement   for   the 
engineering   safety  of  general  systems.      This   specification  was 
closely    followed  in   October   19  63  by    the  Navy's    similar    (and 
nearly   duplicate)    requirement,   MIL-S-38130 (WEPS) .      These   two 
documents  were    later  merged  into  a   joint  specification, 
MIL-S-38130 (ASG) ,    and  in  June   1966,    this    specification  became 
a   Department  of   Defense    (DoD)    requirement,    MIL-S-38130A. 

In   July   1969,    the  System  Safety   Specification  was   revised 
into   a  Military  Standard,   MIL-STD-882    and   in  July    1977, 


MIL-STD-882A  expanded  and  defined  in  more    specific   terms   the 
System  Safety  Program  requirements. 

In   December   19  78,    DoD  Instruction   5000.36,    "System 
Safety  Engineering   and  Management"    stated  that:      "The   Heads 
of  DoD  components  shall  establish  system  safety  programs   and 
apply  Military  Standard   882A. ...for  each  major  system  acqui- 
sition of  other   systems    and   facilities,    as    appropriate,   based 
on   the   severity   of   associated  hazards    and   the  potential   for 
loss    or   damage . . . " . 

The   purpose  of   the   System  Safety  Program,    as    stated  in 
MIL-STD   8  82A  is    "To  provide   uniform  requirements    for 
developing  and  implementing   a  system  safety  program  of 
sufficient   comprehensiveness    to  identify   the  hazards   of  a 
system  and  to  ensure   that  adequate  measures   are   taken   to 
eliminate   or   control   the  hazard". 

The  general   requirements   of   the   System  Safety  program 
are    that   safety,    consistent  with  mission   requirements,    is 
built   into   the   system  in   a   timely,    cost  effective  manner; 
that  hazards   associated  with  each   system  are    identified, 
evaluated,    and  eliminated  or  controlled   to    an    acceptable 
level   throughout   the  entire    life   cycle   of   the   system;    and 
that   retrofit  actions   required  to   improve   safety   are   mini- 
mized through   the   timely   inclusion   of   safety    features    during 
the   development  and  acquisition   of   a   system. 

Each  of   the   above    listed  requirements   deserves    special 
attention   and  comment.      For  example,    it   is    to  be  noted  that 


the  Military  Standard  calls  for  safety  consistent  with 
mission  requirement,  not  "safety  at  any  cost".   The  util- 
ity of  the  system  in  the  expected  use  environment  is  still 
the  prime  factor  for  consideration,  and  the  design  safety 
is  intended,  not  to  inhibit  the  mission,  but  rather  to  en- 
hance the  accomplishment  of  the  mission. 

It  is  also  stated  that  hazards  are  to  be  identified, 
evaluated  and  eliminated  or  controlled  to  an  acceptable 
level.   This  is  the  essence  of  what  might  be  called  the 
System  Safety  Process.   The  use  of  historical  data,  simu- 
lation, synthesis  and  test  and  evaluation  are  required  to 
identify  hazards  of  the  system  while  the  system  is  still 
in  the  design  process. 

The  goal  of  this  action  is  to  minimize  the  risk,  based 
on  hazard  severity,  hazard  probability,  criticality,  cost, 
time,  resources  and  mission,  throughout  the  lift  cycle 
including  disposition  and  disposal. 

By  inserting  the  safety  process  as  early  as  possible 
in  the  design  (and  even  concept)  process,  costly  retrofit 
actions  are  reduced  at  a  great  savings  of  money,  operational 
usage  and  limited  resources.   This  takes  safety  from  its 
prior  "Band- Aid"  approach  of  "try  it  and  then  fix  it"  into 
a  new  regime  of  designing  the  safety  into  the  original 
product. 


II.   RISK  ASSESSMENT 

The   general   requirements   of  MIL-STD   8  82A  include   the 
risk   assessment  procedures   of   conducting  hazard  analyses 
starting  in   the  Conceptual  Phase   and  proceeding  through   the 
production  phase.      These   analyses   include: 

1.  Preliminary  Hazard  Analysis   -   a  "broad  brush" 
look  at  the  potential  hazards   of  systems   and  subsystems. 
The  Preliminary  Hazard  Analysis    (PHA)    is   begun    long  before 
detailed  designs  begin  to   take   shape,    and,    although  quite 
qualitative   in  nature,    the  PHA  provides    a  base    for  other 
types  of  analyses. 

2.  Subsystem  Hazard  Analysis   -   Once   detailed  designs 
are   underway,    more   in-depth   analyses    can   be    conducted  on 
the   subsystems.      The   subsystem  may  be   a  single   component  or 
part,    or  it  may  be   a   complex  mini-system.      Frequently  the 
Subsystem  Hazard  Analysis    (SSHA)    is    conducted  on   a   fairly 
large  block,    and  only   if   the   analysis    indicates    a  high 
degree   of   criticality   is    the   analysis   extended  to   lower 
level   components.      This   is   necessary   due   to    the   time   and 
cost  involved  in   the   analysis  procedure. 

3.  System  Hazard  Analysis   -   Once    an   SSHA  has  been 
completed,    one  has   some    degree   of   confidence   that   the 
critical  hazards  have  been   identified  and  eliminated  or 
controlled  to   an   acceptable    level    in  each  of   the   applicable 
subsystems.      At  this  point   a  Systems   Hazard  Analysis    (SHA) 
is    conducted  to  investigate    the   safety   of   subsystem 


interfaces.  It  is  not  at  all  unlikely  that  the  interrela- 
tionships between  two  "safe"  subsystems  may  result  in  some 
unsafe  action  of  the  combined  system. 

4.   Operating  and  Support  Analysis  -  Although  it  is 
easy  to  forget  during  the  design/production  process  that 
the  ultimate  purpose  is  not  the  development  of  a  system, 
but  is  rather  the  operational  utilization  of  the  system, 
one  must  keep  in  mind  throughout  the  design/production 
phases  that  this  system  must  be  operated  and  supported  to 
meet  its  mission  utility.   This  is  the  purpose  of  Operating 
and  Support  Hazard  Analysis  (O&SHA)  where,  for  the  first 
time,  the  human  element  is  injected  into  the  equation.   The 
O&SHA  considers  operation,  support,  maintenance,  transporta- 
tion and  other  operational  uses  of  the  system. 

The  purpose  of  all  these  analyses  is  to  (a)  identify 
any  potential  hazards,  (b)  evaluate  the  hazards,  (c)  assess 
the  risk  of  the  hazard,  and  (d)  provide  the  necessary  infor- 
mation required  for  the  elimination  or  control  of  the  hazard 
if  the  hazard  is  determined  to  be  critical. 


III.   HAZARD  ANALYSIS  TECHNIQUES 

Although  the  techniques  to  be  used  in  conducting  hazard 
analyses  are  generally  at  the  option  of  the  contractor, 
several  analysis  techniques  that  may  be  used  are  spelled  out 
in  MIL-STD  882A.   These  are: 

1.   Fault  Tree  Analysis  -  The  Fault  Tree  is  based  on 
the  Logic  Tree  procedure  as  developed  by  the  Bell  Laboratories 
It  was  originally  modified  by  the  Boeing  Company  to  trace 
fault  developments  in  the  Minuteman  missile  system.   The 
Tree  uses  logic  "gates'*  to  build  downward  from  a  Head  or 
Undesired  Event.   Inasmuch  as  a  form  of  the  Fault  Tree 
Analysis  may  be  used  in  Software  analysis,  some  detail  of 
this  technique  is  in  order.   Consider  a  system  whose  wiring 
diagram  is  shown  below: 
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Figure  1.   Wiring  diagram  for  Fault  Tree 


"Advanced  Concepts  in  Fault  Tree  Analysis"  by  David  Haasl 
Paper  presented  at  System  Safety  Symposium,  Seattle,  WA  (1965). 


When  the  Switch  is  closed,  the  Timer  Coil  is  energized 
closing  the  Timer  Contacts  which  puts  power  on  the  Relay 
Coil.   With  the  Relay  Coil  energized,  the  relay  contacts 
close,  permitting  current  flow  to  the  motor.   It  has  been 
determined  that  overheating  of  the  wire  from  point  A  to 
point  B  is  critical  to  the  system  operation,  and  a  Fault 
Tree  is  constructed  with  the  Head  Event  "Overheated  Wire". 

Examination  of  the  system  shows  that  two  events  may 
produce  an  overheated  wire,  "Excessive  current  in  the  motor 
system  wiring"  and  "Power  applied  to  the  system  for  an  ex- 
tended time".   It  is  also  seen  that  both  of  these  occurrences 
must  happen  in  concert.   In  other  words,  the  "Overheated 
Wire"  is  a  result  of  "Excessive  Current"  AND  "Power  Applied 
for  an  Extended  Time".   The  logic  event  is,  therefore,  an 
AND  gate. 

To  determine  why  the  power  may  be  applied  for  an  ex- 
tended time,  we  examine  the  circuit  and  note  that  this  may 
occur  if  the  power  is  not  removed  from  the  relay  coil  for 
some  reason,  or  if  the  relay  contacts  fail  in  the  closed 
position.   We  now  have  the  next  branch  of  our  Tree  with  an 
OR  gate  connecting  "Power  not  removed  from  relay  coil"  and 
"Relay  contacts  fail  closed". 

Similar  development  takes  place  proceeding  downward 
from  the  Head  Event  until  one  reaches  primary  or  secondary 
failures.   Primary  failures  are  those  that  occur  while  the 
part  or  component  is  operating  within  the  parameters  for 
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which  it  was    designed,    and   a  Secondary    failure    is  /then    the 

component   is   subjected  to   abnormal,    out-of-design   stresses. 

A  complete   Fault  Tree   diagram   for   the    system  under 
consideration    is    shown   in  Figure    2. 

Analysis   of   a  Fault  Tree  may  be  either  qualitative  or 
quantitative.      If   the   failure  probabilities    for  each  of   the 
"end  faults"    of  Figure   2   are  known,    the   failure  probabili- 
ties  of  each   of   the    intermediate   branches    as  well   as    the 
failure   probability   of   the   Head  Event  may  be    computed   using 
Boolean  Algebra. 

But  even   a  qualitative   examination   of  many  Fault  Trees 
will  provide   useful   information.      With   an   AND  gate    leading 
directly   to   the   Head  Event,    it   is    seen   that   inasmuch   as 
both  of   the  next  lower   level  events  must  occur  for  the   un- 
desired  Head  Event    to  occur,    eliminating  either  of    the  next 
level  events  will  eliminate   the    failure  of  the  Head  Event. 

Since   the   gate    leading  to   the    "Power  applied  for   an  ex- 
tended  time"    is    an   OR  gate,    failure   of  either  of   the   next 
lower  level  events  will  still  produce   this   undesired  sub- 
event.      This  means   that  both  of  the  next   lower  events  must 
be   eliminated  to  prevent   this    sub-failure. 

On    the   other  hand,    the   gate    leading   to   "Excessive 
current  in  the   system  wiring"    is   also   an  AND  gate,    and  this 
sub-failure  will  only  occur   if  both  of   its   sub-level  events 
occur.      One   of   these   events,    "Motor   failed  shorted"   will 
occur  only  with   a  primary  motor  failure,    leading   us    to   the 


conclusion  that  a  high  quality,  high  reliability  motor  will 
prevent  (or  at  least  reduce  the  probability  of  the  occurrence 
of)  the  Head  Event. 

We  see  also  that  if  we  can  prevent  the  fuse  from  failing 
to  open,  we  can  also  prevent  excessive  wiring.   This  may  be 
accomplished  by  both  preventing  the  insertion  of  an  over- 
sized fuse  (design  of  the  fuse  holder)  and  prevention  of  a 
primary  fuse  failure  in  which  the  fuse  fails  to  open  with 
excessive  current.   With  most  fuse  designs,  this  type  of 
fuse  failure  should  be  extremely  remote. 

2.   Fault  Hazard  Analysis  -  A  second  technique  cited 
in  MIL-STD  8  82A  is  the  Fault  Hazard  Analysis  (FHA) ,  which 
is  based  on  the  techniques  long  in  use  by  Reliability  called 
Failure  Mode  and  Effect  Analysis  (FMEA)  or  Failure  Mode, 
Effect,  and  Criticality  Analysis  (FMECA) .   The  Fault  Hazard 
Analysis,  unlike  the  Fault  Tree  Analysis  starts  at  the 
bottom  and  works  up,  rather  than  at  the  top  working  down. 
Whereas,  in  the  Fault  Tree  Analysis  we  conducted  our 
analysis  to  determine  what  failures  would  cause  an  undesired 
event,  in  the  Fault  Hazard  Analysis,  one  starts  with  the 
part  or  component  failure  and  determines  the  ultimate  effect 
of  that  failure. 

The  Fault  Hazard  Analysis  may  be  used  in  detailed 
examination,  such  as  in  an  SHA  where  one  is  considering 
failure  modes  and  effects  of  detailed  designs,  or  it  may  be 
used  in  a  less  structured  format  in  an  overall  analysis  of 
design  concepts,  as  in  the  Preliminary  Hazard  Analysis. 
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As  with  the  Fault  Tree  Analysis,  the  Fault  Hazard 
Analysis  may  be  either  qualitative  or  quantitative.  A 
qualitative  analysis  usually  precedes  any  quantification. 

3.   Sneak  Circuit  Analysis  -  A  third  technique 
referenced  in  MIL-STD  882A  is  that  of  the  Sneak  Circuit 
Analysis.   This  is  a  technique,  originally  developed  by  the 
Boeing  Aerospace  Company,  to  locate  hazards  that  might 
occur  without  a  failure  in  the  system.   A  classic  example 
of  a  Sneak  Circuit  was  in  a  1960's  imported  car  in  which 
the  radio  and  the  brake  lights  both  received  their  power 
from  a  common  terminal  on  the  switch.   The  brake  lights  were 
also  capable  of  being  powered  through  the  emergency  flasher 
module,  and  this  portion  of  the  wiring  was  such  that  when 
the  brake  pedal  was  depressed,  even  with  the  ignition  switch 
off,  the  brake  lights  were  illuminated  -  and  the  radio  came 
on1. 

In  addition  to  these  three  techniques  cited  in  MIL-STD 
882A,  there  are  many  others,  lesser  known  analysis  techniques, 
These  include  Energy  Transfer  techniques  wherein  all  sub- 
systems and  systems  are  considered  on  the  basis  of  energy  in  - 
energy  out,  and  Resource  Utilization  techniques  that  consider 
desired  (and  undesired)  outputs  codified  on  the  basis  of  in- 
put resources. 
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IV.   SOFTWARE  ANALYSIS  -  GENERAL 

Software  reliability  predictions  are  available  that 
provide  numerical  predictions  of  how  many  errors  will  remain 
in  the  software  when  it  is  delivered,  but  these  predictions 
do  not  tell  the  effects  of  a  software  error,  where  it  will 
occur,  or  in  what  mission  phase. 

Several  analysis  techniques  have  been  used  to  give 
partial  answers  to  the  software  system  safety  questions, 
but  these  have  been,  in  general,  limited  to  individual 
disciplines  and/or  to  specific  categories  of  failure. 

There  is,  however,  one  factor  that  acts  as  a  catalyst 
for  the  solution  of  the  software  system  safety  problem, 
and  that  is  the  fact  that  any  software  difficulty  is  tran- 
slatable into  a  general  system  safety  problem  only  if  there 
is  a  hardware  involvement.   That  is  to  say,  the  only  soft- 
ware 'mishap"  that  has  safety  implication  is  one  that 
commands  a  hardware  function  at  the  wrong  time,  in  the  wrong 
sequence,  in  the  wrong  manner  or  when  the  hardware  component 
should  not  be  commanded  at  all. 

This  fact,  in  itself,  gives  rise  to  possible  software 
system  safety  techniques  in  which  the  software  'mishap'  is 
directly  related  to  hardware  'mishaps'  for  which  there  are 
well  known  and  experienced  analysis  techniques,  as  discussed 
in  Part  III.  It  would  appear,  therefore,  that  an  examina- 
tion of  each  known  hardware  mishap  for  possible  software 
command  functions  would  suffice  as  a  software  system  safety 
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analysis.   And  indeed  it  would,  except  that  this  wholesale 
approach  would  be  extremely  time  consuming  and,  as  a  result, 
extremely  costly.   One  would  also  have  to  exercise  great 
caution  to  ensure  that  software  interfaces  are  considered  in 
addition  to  the  already  considered  hardware  interfaces. 

Although  there  are  many  proponents  of  various  single 
technique  methods  for  software  system  safety  analysis,  even 
these  techniques  are  generally  modified  so  as  to  take 
advantage  of  other  techniques  in  some  degree  to  reduce  the 
cost  and  time  that  would  be  otherwise  involved. 

In  general,  software  analysis  techniques  for  system 
safety  follow  the  hardware  analysis  methods.   One  method 
is  the  'bottom  up'  method  similar  to  the  Failure  Mode  and 
Effect  Analysis  and  another  is  the  'top  down'  procedure 
related  to  the  Fault  Tree  Analysis.   Some  detailed  examina- 
tion of  these  two  methods,  as  well  as  a  combined  method  will 
be  discussed  in  subsequent  parts  of  this  summary. 
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V.   TOP  DOWN  SOFTWARE  ANALYSIS 

The  Top  Down  analysis  technique  is  based  on  the  deter- 
mination of  hazardous  termination  points  for  the  software 
program.   This  is  analogous  to  the  Undesired  Event  which  is 
the  Head  Event  of  the  hardware  Fault  Tree  Analysis. 

The  first  step  in  the  Top  Down  techniques  is  to  define 
Acceptable  Terminations  as  well  as  Hazardous  Terminations. 
Such  a  definition  list,  as  related  to  a  missile  system,  is 
as  follows: 

Acceptable  Defined  Terminations 

1.  Missile  successfully  launched. 

2.  Missile  aborted  with  booster  and  warhead  safe. 

3.  Missile  aborted  with  booster  or  warhead  not 
saved  and  with  operator  alerts. 

4.  System  cycled  to  Hold  with  operator  alerts. 

5.  System  cycled  to  Test  Mode  with  operator  alerts. 

6.  System  recycled  to  known  states  with  operator 
alerts. 

7.  System  automatically  cycled  to  power  down  with 
appropriate  operator  alerts . 

Hazardous  Terminations 

1.  Unauthorized  launch  of  a  missile. 

2.  Unintentional  launch  of  a  missile,  including  the 
launch  of  a  wrong  missile. 


Software  Safety  and  Security  Analysis  Techniques  and 
Methods,  Vitro  Laboratories,  Silver  Spring,  MD,  19  80. 
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3.  Missile  abort  without  operator  alerts „   The 
alerts  to  include  the  safety  evaluation  of  the 
aborted  missile. 

4.  A  premature  issuance  of  a  unique  pre-arm  signal. 

5.  An  abort  without  adequate  declassification. 

6.  An  unauthorized  disclosure  of  classified  data. 

7.  Any  unauthorized  or  unintentional  modification 
of  safety  and/or  security  sensitive  programs  or 
data. 

8.  Any  termination  which  is  not  a  member  of  the  set 
of  acceptably  defined  terminations. 

It  is  to  be  observed  that  several  of  the  Hazardous 
Termination  categories  go  well  beyond  the  safety  of  the 
hardware.   For  example,  Item  number  3,  "Missile  abort  with- 
out operator  alerts"  might  appear  to  have  "Fail  Safe" 
connotations  in  that  the  missile  aborted,  presumably  due  to 
proper  software/hardware  interaction.   However,  a  new  and 
unique  safety  problem  may  arise  if  the  operator  is  unaware 
of  the  abort  or  of  the  reason  for  the  abort  and  attempts  to 
operate  the  missile  again. 

With  the  hierarchial  Top  Down  development  process,  the 
software  design  cycles  is  as  shown  in  Figure  3.  As  in  the 
hardware  system  safety,  the  best  payoff  occurs  if  the  soft- 
ware system  safety  is  inserted  early  in  the  design  process. 
Although  it  is  true  that  many  systems  have  an  initial  hard- 
ware design  followed  by  the  design  of  the  control  software, 
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even   these   systems    should   undergo   a   software   analysis    for 
subsystems    and  for  system  interfaces    as   early   in   the  design 
process   as  practicable. 

For  those   subsystems   and  systems   in  which   the  software 
is   "King",    that   is    to   say,    for  systems   in  which   the  software 
is    the  principal,    and  possibly   the  first,    design   item  and 
then  the  hardware   is   designed  to  accomplish  the  desired  out- 
puts  of   the   system,   early   analysis   of  the  software   is 
essential. 

The    flow  of   control,    as    shown  in  Figure    3,    starts    at 
the   top   level,   goes   down  one  or  more   levels,    comes  back  as 
required,    and  then  goes   down  to  another  level. 

The   actual  assessment  depends  on   the   construction  of   a 
series   of  binary   trees    that  have   the   following   attributes: 

1.  There   is    a  one-to-one   correspondence  between 
decision  points   of    the   software   and  nodal  branch  points   of 
the   tree. 

2.  Each  branch   can  be   categorized  by   a   series   of 

state   transitions   that  can  be  described  by  Boolean   alogorithms. 

3.  The   origin  of  each  binary   tree  is   always   an  operation 
action,    a  machine-generated  priority   interrupt,    a  periodic 
status  monitoring,   or  a  return   to  a  higher  level  process. 

Figure    4    is    a  generalized  Binary   State   Transition  Tree 
which   illustrates   all   constructs.      Such   a   tree  will  be   examined 
for  possible   occurrences   of  system  errors    caused  by  program 
errors    (incomplete   definition  of   state  vectors   or  predicate 
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transforms)  or  by  single  or  multiple  hardware  failures. 
System  errors  are  treated  as  virtual  branch  points  in  which 
the  occurrence  and  subsequent  processing  is  also  treated  as 
a  binary  tree. 

Such  a  tree  is  called  a  Binary  Fault  Termination  Tree 
and  is  also  shown  in  Figure  4.   This  tree  has  end  points 
that  may  describe  hazardous  terminations,  and,  as  such,  is 
of  prime  interest  in  software  system  safety  analysis. 

As  has  been  previously  stated,  the  only  software 
'mishap'  that  is  of  concern  in  the  system  safety  analysis 
is  one  that  produces  an  undesired  hardware  event.   For  this 
reason,  software  errors  that  occur  but  are  trapped  and 
contained  and  do  not  produce  hazardous  terminations  are  of 
but  secondary  interest  in  the  safety  analysis.   The  Binary 
Fault  Termination  Trees  are  used  to  evaluate  whether 
checkpoints  are  established  to  trap  and  contain  the  software 
errors. 

When  errors  have  been  identified  and  the  processing 
shows  that  a  hazardous  termination  is  being  approached, 
the  process  may  be  translated  into  a  probability  function 
to  determine  the  probability  of  the  hazardous  termination. 
This  translation  implies  that  the  following  probabilities 
can  be  computed: 

1.  The  probability  that  the  system  moves  into  the 
implied  mode. 

2.  The  probability  that  the  system  moving  from  the 
implied  mode  misses  the  illegal  procedure  trap. 
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VI.   BOTTOM  UP  SOFTWARE  ANALYSIS 

Another  technique  that  is  used  for  software  analysis 
is  the  Bottom  Up  Analysis.   This  technique  is  based  on  the 
predication  that  any  hazard  that  is  created  or  allowed  to 
propogate  must  exist  in  hardware.   Because  of  this,  the 
software  safety  analysis  should  analyze,  within  the  software 
boundaries,  concerns  identified  within  the  hardware  facets. 
This  stipulates  that  the  software  analysis  is  an  extension  of 
the  Preliminary  Hazard  Analysis  (PHA) . 

Inasmuch  as  the  Subsystem  Hazard  Analysis  (SSHA)  and 
the  System  Hazard  Analysis  (SHA)  follow  the  Preliminary 
Hazard  Analysis,  the  command  and  control  software  development 
and  analysis  should  follow  the  basic  development  of  the  hard- 
ware that  is  to  be  controlled. 

This  offers  the  use  of  the  techniques  similar  to  the 
Fault  Hazard  Analyses  of  hardware  system  safety.   This 
technique,  which  is  akin  to  the  long-used  Reliability  tech- 
niques of  Failure  Mode  and  Effect  Analysis  (FMEA)  or  the 
Failure  Mode,  Effect  and  Criticality  Analysis  (FMECA) , 
considers  the  undesired  outcomes  of  the  failure  of  a  sub- 
system or  component. 

The  software  analyses  should  address  hazards  resulting 
from  basic  deficiencies  in  the  requirements,  the  software 
program  design,  the  internal  coding,  the  software  testing, 
the  user  interfaces,  the  backup  software  functions  and  the 
hardware/software  interfaces. 
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The   general   approach   for  this    technique   is    as    follows: 

1.  Divide   the   system  into  portions   to  gain  insight  into 
sub-function  operations. 

2.  Conduct  the  hardware  system  safety  analysis. 

3.  Conduct  the  software  system  safety  analysis. 

a.  Analyze  for  correct  implementation  of  the 
hardware  functional  requirements  and  interfaces. 

(1)  Review  the  functional  requirements 
of  the  subsystems  being  controlled. 

(2)  Analyze  the  requirements  of  the 
software  definition  and  implementation  to  ensure  that 
the  software  is  kept  in  a  safe  configuration. 

b.  Analyze  for  internal  software  anomalies  which 
would  affect  the  execution  of  the  software. 

(1)  Analyze  those  functions  that  have  been 
designated  safety  critical. 

(2)  Analyze  those  functions  that  might  affect 
the  execution  of  the  safety  critical  functions. 

(3)  Review  for  implementation  of  errors  or 
anomalies  in  the  detailed  requirements,  the  detailed  program 
design  and/or  the  coding. 

(4)  Review  the  timing  of  competing/overlapping 
functions. 


Software  Safety  Analysis,  A  Presentation  by  John  G. 
Griggs,  Martin  Marietta  Corp.  at  Tri-Service  Safety  Conference 
Colorado  Springs,  CO,  September  19  80. 
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(5)  Review  the  use  of  incorrect  data  and/or 
changing  data. 

(6)  Review  the  use  of  coding  techniques  to 
minimize  the  effects  of  errors. 


22 


VII.   COMBINED  TECHNIQUE 

A  combination  of  the  Top  Down  and  the  Bottom  Up 
Techniques  has  been  developed  by  the  Boeing  Aerospace  Company 
into  a  technique  called  Integrated  Path  Critical  Analysis 
(ICPA) . 

The  seven  step  ICPA  combines  Fault  Tree  Analysis  for  the 
identification  of  critical  paths,  Failure  Mode  and  Effect 
Analysis  for  component  failure  rates,  Sneak  Circuit  Analysis 
for  actual  detailed  system  configuration,  Component  Sensi- 
tivity Analysis  for  the  determination  of  areas  of  emphasis 
and  Critical  Path  Analysis  to  evaluate  and  quantify  overall 
system  reliability  and  safety. 

The  steps  outlined  by  Boeing  Aerospace  for  this 
technique  are  as  follows: 

1.  Determine  the  Scope  of  the  Analysis. 

a.   This  is  usually  accomplished  by  reviewing  the 
top  level  hazards  and/or  critical  functions  and  then  deciding 
which  ones  should  be  analyzed  in  detail  and  to  what  level. 
This  may  be  done  by  using  existing  Fault  Trees  and  FMEA's 
and  marking  them  for  particular  emphasis. 

2.  Establishment  of  Accurate  System  Configuration  in 
the  form  of  Integrated  Functional  Network  Trees. 


Software/Hardware  Integrated  Critical  Path  Analysis 
(ICPA)  by  J,  H,  Campbell  and  F.  H.  Tuna  in  Proceedings  of 
Fourth  International  System  Safety  Conference,  July  §-13,  19  79 
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a.   It  is  essential  that  the  analysis  be  conducted 
on  an  accurately  represented  system.   The  use  of  logic  trees 
assists  in  the  determination  of  the  accuracy  of  the  system 
configuration  under  examination. 

3.  Updating  and  Integrating  Fault  Trees. 

a.   Here  one  applies  the  material  generated  from 
Step  #2  and  adds  the  software  network  threats  which  change 
the  impact  of  the  hardware  due  to  usage  and/or  interconnect 
hardware,  thereby  changing  the  hardware  interrelationships. 

4.  Reliability  Prediction  of  Critical  Paths. 

a.  Once  the  critical  paths  have  been  defined,  the 
reliability  numbers  can  be  affixed  to  present  a  reliability 
prediction. 

5.  Procedure  Analysis. 

a.      This  step  considers   all  procedures   including 
test,    contingency  and  backup. 

6.  Failure  Effect  Analysis. 

a.   The  previous  steps  have  emphasized  the  possible 
critical  steps  and  paths,  and  these  functions  are  now  analyzed. 

7.  Analyses  Reports. 

a.   These  reports  are  generated  to  permit  management 
decisions  in  regard  to  changes  and  alterations  to  the  suspect 
subsystems  and  systems.   These  reports  should  contain  the 
revised  Fault  Trees,  sensitivity  statements,  hardware/software 
interaction  problems  and  the  failure  effects  of  critical  paths. 
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VIII.   CONCLUSIONS 

The  subject  of  software  system  safety  is  relatively  new. 
Concern  in  the  Electronic  Industries  Association  (EIA)  was 
first  expressed  about  a  year  and  a  half  ago  and  the  EIA 
formed  a  task  force  to  develop  a  generic  approach  to  the 
analysis  of  software  system  safety  in  August  1979. 

This  action  is  significant  due  to  the  activity  of  the 
G-4  8  Committee  of  the  EIA  in  developing  codified  System 
Safety  requirements  and  fostering  the  development  of  System 
Safety  analysis  techniques. 

There  appear  to  be  techniques  available  for  the  analysis 
of  System  Safety  software  problems,  and  these  techniques  are, 
in  general,  based  on  the  proven  techniques  used  in  hardware 
System  Safety  analysis. 

Despite  the  particular  technique  to  be  used,  it  appears 
that  software  analysis  parallel  with  software  development  has 
the  greatest  payoff  at  this  time. 

Although  the  analysis  techniques  in  greatest  use  to  date 
have  been,  at  the  most,  a  modification  of  the  hardware  tech- 
niques, there  are  people  working  on  different  techniques  which 
have,  as  yet,  not  demonstrated  any  spectacular  results.   This 
is  not  to  say,  however,  that  such  methods  are  not  to  be  closely 
monitored,  because  it  is  quite  possible  that  these,  rather 
radical,  methods  may  yet  have  a  significant  payoff. 


25 


DISTRIBUTION 


1.  Library 
Code  0142 

Naval  Postgraduate  School 
Monterey,  CA   9  39  40 

2.  Research  Administration 
Code  012 A 

Naval  Postgraduate  School 
Monterey,  CA  9  39  40 

3.  Chief  of  Naval  Research 
Arlington,  VA  22217 


NO.  OF  COPIES 


5. 


Professor  Donald  M.  Layton 
Code  67Ln 

Naval  Postgraduate  School 
Monterey,  CA  9  39  40 

Defense  Technical  Information  Center 
Cameron  Station 
Alexandria, Virginia  22314 


U195615 


26 


DUDLEY  KNOX  LIBRARY  -  RESEARCH  REPORTS 


5  6853  01068162  0 


